Bilingually-Guided Monolingual Dependency Grammar Induction
نویسندگان
چکیده
This paper describes a novel strategy for automatic induction of a monolingual dependency grammar under the guidance of bilingually-projected dependency. By moderately leveraging the dependency information projected from the parsed counterpart language, and simultaneously mining the underlying syntactic structure of the language considered, it effectively integrates the advantages of bilingual projection and unsupervised induction, so as to induce a monolingual grammar much better than previous models only using bilingual projection or unsupervised induction. We induced dependency grammar for five different languages under the guidance of dependency information projected from the parsed English translation, experiments show that the bilinguallyguided method achieves a significant improvement of 28.5% over the unsupervised baseline and 3.0% over the best projection baseline on average.
منابع مشابه
Toward Statistical Machine Translation without Parallel Corpora
We estimate the parameters of a phrasebased statistical machine translation system from monolingual corpora instead of a bilingual parallel corpus. We extend existing research on bilingual lexicon induction to estimate both lexical and phrasal translation probabilities for MT-scale phrasetables. We propose a novel algorithm to estimate reordering probabilities from monolingual data. We report t...
متن کاملShared Logistic Normal Distributions for Soft Parameter Tying in Unsupervised Grammar Induction
We present a family of priors over probabilistic grammar weights, called the shared logistic normal distribution. This family extends the partitioned logistic normal distribution, enabling factored covariance between the probabilities of different derivation events in the probabilistic grammar, providing a new way to encode prior knowledge about an unknown grammar. We describe a variational EM ...
متن کاملReranking Bilingually Extracted Paraphrases Using Monolingual Distributional Similarity
This paper improves an existing bilingual paraphrase extraction technique using monolingual distributional similarity to rerank candidate paraphrases. Raw monolingual data provides a complementary and orthogonal source of information that lessens the commonly observed errors in bilingual pivotbased methods. Our experiments reveal that monolingual scoring of bilingually extracted paraphrases has...
متن کاملBilingually-Constrained (Monolingual) Shift-Reduce Parsing
Jointly parsing two languages has been shown to improve accuracies on either or both sides. However, its search space is much bigger than the monolingual case, forcing existing approaches to employ complicated modeling and crude approximations. Here we propose a much simpler alternative, bilingually-constrained monolingual parsing, where a source-language parser learns to exploit reorderings as...
متن کاملHybrid text simplification using synchronous dependency grammars with hand-written and automatically harvested rules
We present an approach to text simplification based on synchronous dependency grammars. The higher level of abstraction afforded by dependency representations allows for a linguistically sound treatment of complex constructs requiring reordering and morphological change, such as conversion of passive voice to active. We present a synchronous grammar formalism in which it is easy to write rules ...
متن کامل